feat(B-0183 Phase 1 sibling): TS wrapper for Alloy + fix AlloyRunner.java API (latent silent-no-op fix) by AceHack · Pull Request #1413 · Lucent-Financial-Group/Zeta

AceHack · 2026-05-03T14:01:09Z

Summary

Sibling of #1412 (TLC wrapper). Writing the TS wrapper surfaced a latent silent-no-op in F# Alloy tests — same bug class as the cache-clobber.

Latent bug class (lucky catch)

Aaron 2026-05-03: "so it was not working before, not sure how that got missed :)"

AlloyRunner.java:38 references A4Options.SatSolver.SAT4J — nested enum doesn't exist in Alloy 6.2.0 (current jar)
F# compileRunnerIfNeeded() returns false on compile failure; toolchainReady() returns false; assertSpecValid() silently no-ops; xunit reports tests as PASSED
TS wrapper fails loudly with exit 2 (toolchain-not-ready) — exactly the structural advantage B-0183 promises

Fix

Removed opts.solver = A4Options.SatSolver.SAT4J; line. Default resolves to SAT4J on Alloy 6.x without explicit assignment. Added comment explaining the upstream rename.

Verification

AlloyRunner.java compiles clean
run-alloy.ts --check-toolchain → OK
run-alloy.ts InfoTheoreticSharder → OK (real Alloy validation now)
run-alloy.ts Spine → FAIL (separate spec bug surfaced — Spine.als type error at line 35 col 25 under Alloy 6.2.0; filing as backlog row)

Why "right language for the job"

Aaron 2026-05-03: "f# is not really make for calling command line calls, the strong functional nature makes it awkward. this is using the right language for the job in action."

Test plan

AlloyRunner.java compiles
Wrapper exit codes correct (0/1/2)
gitignore tools/alloy/classes/

🤖 Generated with Claude Code

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b0cff9cba8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copilot

Pull request overview

Adds a Bun/TypeScript CLI wrapper to run Alloy specs directly (mirroring the TLC wrapper approach) and fixes a latent Alloy 6.x incompatibility in the Java runner so Alloy validations actually execute instead of silently no-op’ing.

Changes:

Add tools/formal-verification/run-alloy.ts wrapper with --check-toolchain, --all, and per-spec invocation modes.
Fix tools/alloy/AlloyRunner.java by removing the obsolete explicit SAT4J enum assignment (Alloy 6.x API change).
Ignore tools/alloy/classes/ to keep compiled runner artifacts out of git.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

File	Description
tools/formal-verification/run-alloy.ts	New Bun/TS CLI to compile/run AlloyRunner and validate `.als` specs with explicit exit codes.
tools/alloy/AlloyRunner.java	Removes reference to a non-existent solver enum in Alloy 6.x; documents upstream change.
.gitignore	Ignores compiled AlloyRunner output directory.

…process which + exit code 3 + fail-on-missing + failure-details Same set of #1412/#1413 review fixes applied to the Alloy wrapper: - Drop unused readdirSync, readFileSync, unlinkSync imports (tsc lint failure on #1413) - Replace /usr/bin/env which shell-out with in-process PATH-scan (matches F# AlloyRunnerTests pattern; portable to Windows) - Exit code 3 for unknown-flag (orthogonal from exit 2 = toolchain not ready) - Treat missing-from-catalogue as failure not skip (catalogue drift gate) - Print per-failure stdout-tail + stderr-tail in --all mode for CI triage - Tightened ExitCode union type per Aaron 2026-05-03 strong-typing guidance Local verification: --check-toolchain OK; InfoTheoreticSharder passes cleanly; --bad-flag returns exit 3.

…026-05-03) (#1414) Captures the six author-time disciplines extracted from #1412/#1413 review cycles. Per Aaron 2026-05-03 'any classes to learn to code better next time via meta learning at PR?' + 'we want to be strongly typed to a large degree in ts it verifies your assumptions more at lint time'. Six classes: 1. Pre-push lint discipline (bun tsc --noEmit catches unused imports + typing; CodeQL self-audit for regex injection) 2. Sibling-pattern audit before authoring (when porting F#→TS, read F# patterns first) 3. Verify-then-claim applied to comments ('X matches Y' claims need grep verification) 4. Exit-code orthogonality (one code = one semantic; document contract first; enforce via union types) 5. DX-think on failure paths (failure UX is part of contract; print rerun command + stdout-tail) 6. Strong typing as assumption-verifier (union types over numbers; brand types over strings; readonly default) Triggering cases per class documented (each from real #1412/#1413 review threads). Composes with chat-is-assertion-channel + cache-paths-mutually- exclusive + the broader verify-then-claim cluster. MEMORY.md pointer added newest-first. Carved sentence: 'Apply the review feedback at author-time next time. Each review thread that closes is a meta-class to internalize, not just a fix to land.'

…move F# Skip Reviewer #1415 caught two related issues: 1. Skip removes only CI check for Spine.als (regression after #1413 landed AlloyRunner.java compile fix; spec was about to be on active CI surface) 2. B-0184 'Why P2' was inaccurate — post-#1413 the spec IS running Aaron 2026-05-03: 'fix them like you normally would'. So instead of deferring with Skip + P2, fix the spec inline + remove Skip + bump B-0184 to P1 + close. Two-part spec fix: 1. Parens around sum-comprehension at line 35: (sum b : l.batches | b.size) <= mul[2, cap[l.level, maxCap]] Matches comment intent 'total size at level i ≤ 2 * cap(i)' (sum-then-compare, not sum-of-Booleans). 2. Replaced check command with fact + run: - Original: `check SizeDoublingHolds { SizeDoubling[1] }` — asks 'does property hold for ALL bounded instances?' Alloy trivially constructs counterexamples by allowing arbitrary Int batch sizes (no constraint). - Fix: added `fact NonNegativeBatchSizes` (real LSM spines have non-negative element counts) + replaced check with `run SizeDoublingAdmitsInstance` for constructive existence proof. `7 Int` bitwidth bump prevents cap-function overflow. Local verification: `java -jar alloy.jar exec -f Spine.als` returns `00. run SizeDoublingAdmitsInstance 0 1/1 SAT` (instance found). AlloyRunner translates SAT-on-run to OK. F# Skip removed; Spine [Fact] back on active CI surface. Backlog row moved P2/ → P1/ to match the bumped priority. Status flipped to closed. Resolution + 3 discipline lessons appended: 1. Check-vs-run-vs-fact semantic confusion (the substantive spec-design lesson) 2. Alloy Int bitwidth must accommodate cap-function range 3. Coverage-loss-is-regression: when a fix becomes part of the active CI surface, deferring loses coverage rather than just defers it (the reviewer's P2→P1 framing was right) .gitignore Spine/ (Alloy run-output directory). Composes with #1413 (the AlloyRunner.java fix that surfaced this) + #1414 (PR-review meta-classes memory file — discipline lesson #3 above is a fresh instance of the meta-class).

…t bug surfaced by #1413) (#1415) * fix+backlog: skip Spine F# Alloy test pending B-0184 spec fix The pre-#1413 cache-clobber + AlloyRunner.java compile failure was silently no-op'ing all Alloy tests. Once the cache narrowed (#1404) and AlloyRunner.java fixed (#1413), CI actually runs Alloy now — which surfaces the latent Spine.als spec bug at line 35 (Alloy 6.2.0 type-check error: sum-comprehension expects Int but finds Boolean). Two parts in this PR: 1. **B-0184 backlog row** filed — captures the spec bug + likely intent (sum vs all comprehension confusion) + fix path + the silent-skip lineage. P2 since it's pre-existing latent failure surfaced post-cleanup. 2. **F# Alloy.Runner.Tests.fs**: add Skip="..." attribute to the Spine [Fact] so CI doesn't fail until B-0184 lands. The skip message references B-0184 so future-Otto can find the row. Test result post-fix: 2 passed (InfoTheoreticSharder + jar- installed), 1 skipped (Spine pending B-0184). Build clean (0/0). Composes with B-0183 Phase 1 (#1413 surfaced this bug class) + B-0183 Phase 3 retirement of F# Alloy tests — Spine.als fix needs to land before TS wrapper retires F# tests so the wrapper passes. * fix(B-0184): two-part spec fix — parens + check→fact+run + P2→P1 + remove F# Skip Reviewer #1415 caught two related issues: 1. Skip removes only CI check for Spine.als (regression after #1413 landed AlloyRunner.java compile fix; spec was about to be on active CI surface) 2. B-0184 'Why P2' was inaccurate — post-#1413 the spec IS running Aaron 2026-05-03: 'fix them like you normally would'. So instead of deferring with Skip + P2, fix the spec inline + remove Skip + bump B-0184 to P1 + close. Two-part spec fix: 1. Parens around sum-comprehension at line 35: (sum b : l.batches | b.size) <= mul[2, cap[l.level, maxCap]] Matches comment intent 'total size at level i ≤ 2 * cap(i)' (sum-then-compare, not sum-of-Booleans). 2. Replaced check command with fact + run: - Original: `check SizeDoublingHolds { SizeDoubling[1] }` — asks 'does property hold for ALL bounded instances?' Alloy trivially constructs counterexamples by allowing arbitrary Int batch sizes (no constraint). - Fix: added `fact NonNegativeBatchSizes` (real LSM spines have non-negative element counts) + replaced check with `run SizeDoublingAdmitsInstance` for constructive existence proof. `7 Int` bitwidth bump prevents cap-function overflow. Local verification: `java -jar alloy.jar exec -f Spine.als` returns `00. run SizeDoublingAdmitsInstance 0 1/1 SAT` (instance found). AlloyRunner translates SAT-on-run to OK. F# Skip removed; Spine [Fact] back on active CI surface. Backlog row moved P2/ → P1/ to match the bumped priority. Status flipped to closed. Resolution + 3 discipline lessons appended: 1. Check-vs-run-vs-fact semantic confusion (the substantive spec-design lesson) 2. Alloy Int bitwidth must accommodate cap-function range 3. Coverage-loss-is-regression: when a fix becomes part of the active CI surface, deferring loses coverage rather than just defers it (the reviewer's P2→P1 framing was right) .gitignore Spine/ (Alloy run-output directory). Composes with #1413 (the AlloyRunner.java fix that surfaced this) + #1414 (PR-review meta-classes memory file — discipline lesson #3 above is a fresh instance of the meta-class).

Backlog generator drift catch from PR #1416 — `bun tools/backlog/generate-index.ts` needs to be re-run after moving a row between priority directories. Same author-time class as B-0184 P2->P1 move (same #1413/#1415 cycle). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…java API + gitignore classes/ Sibling of run-tlc.ts. Same shape: detect toolchain, shell to java -cp alloy.jar:classes/ AlloyRunner <SpecFile>, parse output for OK/FAIL markers, exit accordingly. **Latent bug surfaced (lucky catch class) by writing the wrapper:** Aaron 2026-05-03: "so it was not working before, not sure how that got missed :)". Confirmed: 1. **AlloyRunner.java compile failure**: line 38 references A4Options.SatSolver.SAT4J — that nested enum no longer exists in alloy.jar v6.2.0 (current). The shipped jar exposes opts.solver as kodkod.engine.satlab.SATFactory; default solver is SAT4J already. 2. **F# tests silently passed via skip-on-error**: when compileRunnerIfNeeded() failed, toolchainReady() returned false, assertSpecValid silently no-op'd. xunit reported all 3 Alloy tests as PASSED. Same class as the cache-clobber bug — silent pass via skip-on-error. 3. **TS wrapper fails loudly** instead: exit 2 (toolchain not ready) when compile fails. This is exactly the structural advantage B-0183 promises — TS wrappers don't have the silent- no-op trap. **Fix in this PR:** - Removed `opts.solver = A4Options.SatSolver.SAT4J;` line — default resolves to SAT4J on Alloy 6.x without explicit assignment. - Added explanatory comment about the upstream rename so future- Otto doesn't re-add the line. **Verification post-fix:** - AlloyRunner.java compiles clean - run-alloy.ts --check-toolchain → OK - run-alloy.ts InfoTheoreticSharder → OK - run-alloy.ts Spine → FAIL (separate spec bug — Spine.als has a type-check failure under Alloy 6.2.0: line 35 col 25 expects integer but gets Boolean. Likely `sum` should be `all`. Filing as separate backlog row.) **.gitignore**: added tools/alloy/classes/ (compiled .class files; F# tests + TS wrapper both materialize them on first run). Phase 2 (next PR) wires both run-tlc.ts + run-alloy.ts into a Linux-only CI workflow. Phase 3 retires F# tests.

…process which + exit code 3 + fail-on-missing + failure-details Same set of #1412/#1413 review fixes applied to the Alloy wrapper: - Drop unused readdirSync, readFileSync, unlinkSync imports (tsc lint failure on #1413) - Replace /usr/bin/env which shell-out with in-process PATH-scan (matches F# AlloyRunnerTests pattern; portable to Windows) - Exit code 3 for unknown-flag (orthogonal from exit 2 = toolchain not ready) - Treat missing-from-catalogue as failure not skip (catalogue drift gate) - Print per-failure stdout-tail + stderr-tail in --all mode for CI triage - Tightened ExitCode union type per Aaron 2026-05-03 strong-typing guidance Local verification: --check-toolchain OK; InfoTheoreticSharder passes cleanly; --bad-flag returns exit 3.

chatgpt-codex-connector · 2026-05-03T15:04:41Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Backlog generator drift catch from PR #1416 — `bun tools/backlog/generate-index.ts` needs to be re-run after moving a row between priority directories. Same author-time class as B-0184 P2->P1 move (same #1413/#1415 cycle). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated no new comments.

… -> A: 4 of 4 deferred specs in CI) (#1416) * fix(tla): SpineMergeInvariants counterexample (B-0181 P2->P1 closed) Cascade(i) was under-specified: only required levels[i] >= Cap(i), no constraint on level i+1. TLC found a 16-step trace where Cascade(0) fires 5 times in a row, accumulating level 1 to 10 > 2*Cap(1) = 8 even though Cascade(1) was enabled the whole time. Fix mirrors the synchronous cascade chain in BalancedSpine.fs: add levels[i+1] + levels[i] <= 2 * Cap(i+1) precondition so a level can't dump while its downstream is at the cap-overshoot boundary. Added WF_vars(Cascade(i)) for liveness. Added StateBound + PendingBound constraints with bounded constants (MaxLevel=2 / MaxBatchSize=1) so TLC's BFS terminates: 400 distinct states, depth 45, both InvCap and InvMass hold. Wired CI registration: [<Fact>] TLC validates SpineMergeInvariants in tests/Tests.FSharp/Formal/Tlc.Runner.Tests.fs. Closes B1 -> A in the math-proofs honest assessment matrix (4 of 4 deferred TLA+ specs now in CI). Promoted P2 -> P1 because same author-time class as B-0184 (Alloy Spine.als under-specified pred): under-specified action preconditions across two different formal-verification tools. .gitignore: covered both run-from-tools/tla and run-from-tools/tla/specs TLC state directories + TTrace files. Memory: feedback_under_specified_action_preconditions_recurring_class_in_formal_specs_aaron_2026_05_03.md captures the recurring class with author-time precondition-audit discipline. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(backlog): regenerate index after B-0181 P2->P1 move Backlog generator drift catch from PR #1416 — `bun tools/backlog/generate-index.ts` needs to be re-run after moving a row between priority directories. Same author-time class as B-0184 P2->P1 move (same #1413/#1415 cycle). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * review(B-0181): address Copilot feedback on #1416 Two of three Copilot comments addressed (third — BACKLOG.md regen — was already addressed by 9350572 / commit-2 of this branch): 1. Updated StateBound's comment in SpineMergeInvariants.tla to reflect actual model constraints (PendingBound also active in cfg) — drops the misleading "16-slot pendingIn = 30" calculation that was based on Accept's loose internal bound rather than the effective cfg constraints. Both bounds now jointly justified. 2. Replaced the closed row's "Investigation needed" section with "Investigation summary (resolved)" — past-tense triage of the three failure classes plus which one applied (none of spec-incorrect / invariant-over-spec / real-BalancedSpine-bug; the actual class was under-specified action precondition). TLC still clean: 400 distinct states, depth 45, both InvCap and InvMass hold. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

… shard (#1418) * hygiene(tick-history): 2026-05-03T15:12Z session-summary tick B-0181 SpineMergeInvariants closure (#1416 merged) + B-0183 Phase 1 sibling Alloy TS wrapper landed (#1413 merged after rebase) + Stryker B3 workflow opened (#1417). Math-proofs assessment matrix: B1 -> A fully closed (4/4 deferred TLA+ specs in CI); B3 in-flight closure now pending. Discipline lesson encoded: under-specified-action-preconditions as recurring class across formal-verification tools (TLA+ + Alloy). Author-time precondition-audit is the structural fix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(tick-shard): escape glob ** to satisfy markdownlint MD037 src/Core/** + tests/Tests.FSharp/** rendered the ** as bold-end with a space, tripping MD037 no-space-in-emphasis. Backtick-quote the path patterns to suppress markdown emphasis interpretation. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * review(tick-shard): use cron-id (a2e2cc3a) in col3 per shard schema Reviewer caught: 3rd column documented as cron sentinel/id, not action summary. Move "post-1330Z session compaction recovery + B-0181 closure" into the body column where it belongs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 3, 2026 14:01

AceHack enabled auto-merge (squash) May 3, 2026 14:01

Copilot started reviewing on behalf of AceHack May 3, 2026 14:01 View session

github-code-quality Bot found potential problems May 3, 2026

View reviewed changes

Comment thread tools/formal-verification/run-alloy.ts Fixed

chatgpt-codex-connector Bot reviewed May 3, 2026

View reviewed changes

Comment thread tools/formal-verification/run-alloy.ts Outdated

Comment thread tools/formal-verification/run-alloy.ts Outdated

Copilot AI reviewed May 3, 2026

View reviewed changes

Comment thread tools/formal-verification/run-alloy.ts Outdated

Comment thread tools/formal-verification/run-alloy.ts Outdated

AceHack mentioned this pull request May 3, 2026

memory: PR-review meta-classes — six author-time disciplines (Aaron 2026-05-03) #1414

Merged

3 tasks

AceHack mentioned this pull request May 3, 2026

fix+backlog(B-0184): skip Spine F# Alloy test pending spec fix (latent bug surfaced by #1413) #1415

Merged

AceHack added 2 commits May 3, 2026 11:03

AceHack force-pushed the feat/b0183-phase1-alloy-ts-wrapper branch from c9e17b7 to ace17c5 Compare May 3, 2026 15:04

Copilot AI review requested due to automatic review settings May 3, 2026 15:04

Copilot started reviewing on behalf of AceHack May 3, 2026 15:04 View session

AceHack merged commit 3483922 into main May 3, 2026
29 checks passed

AceHack deleted the feat/b0183-phase1-alloy-ts-wrapper branch May 3, 2026 15:07

Copilot AI reviewed May 3, 2026

View reviewed changes

AceHack mentioned this pull request May 3, 2026

hygiene(tick-history): 2026-05-03T15:12Z B-0181 + B-0183 + B3 session shard #1418

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(B-0183 Phase 1 sibling): TS wrapper for Alloy + fix AlloyRunner.java API (latent silent-no-op fix)#1413

feat(B-0183 Phase 1 sibling): TS wrapper for Alloy + fix AlloyRunner.java API (latent silent-no-op fix)#1413
AceHack merged 2 commits intomainfrom
feat/b0183-phase1-alloy-ts-wrapper

AceHack commented May 3, 2026

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot commented May 3, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AceHack commented May 3, 2026

Summary

Latent bug class (lucky catch)

Fix

Verification

Why "right language for the job"

Test plan

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot commented May 3, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants